n (since we only run it a small number of
> docs). Doing asymmetric quantization (inspired by BBQ) at the rescoring
> phase, not only would we improve the recall but also the latency.
>
> On Tue, Feb 11, 2025 at 11:50 PM Michael Sokolov wrote:
>
> > Stored fields is a separate format
congratulations and welcome!
Mike
On Mon, Jul 21, 2025 at 6:13 AM Uwe Schindler wrote:
>
> Hi,
>
> Congrats and welcome!
>
> Uwe
>
> On 19.07.2025 at 02:47, Adrien Grand wrote:
>
> Hello everyone,
>
> I'm pleased to announce that Pan Guixin has accepted the PMC's invitation to
> become a commi
> Just some additional info: From what I figured out, cherry-picking to
> stable branch is possible, although the sourceset and directories have
> different names, because git is intelligent enough to recognize this
> like a file rename.
>
> Uwe
>
> Am 08.07.2025 um 22:00 sc
Java versions. Therefore the
> compilation is done using a stub:
> https://github.com/apache/lucene/tree/main/lucene/core/src/generated/jdk
>
> In addition when Panama Vector enters Preview mode (should happen hopefully
> soon after Java 25) we need a separate sourceset anyways.
>
I'm curious why we have lucene/core/src/java24 directory/module on main branch
instead of moving these classes to lucene/core/src/java, now that JDK24 is
mandatory.
This sounds doable, but we never got to it.
>
>
>
> On Fri, Jun 27, 2025 at 2:19 PM Michael Sokolov wrote:
>
> > Without this temp file we would need to load the entire set of vectors
> > for the new merged segment into RAM in order to support building an
> > HNSW gra
Without this temp file we would need to load the entire set of vectors
for the new merged segment into RAM in order to support building an
HNSW graph from it. This way we can read the vectors off the disk in
the same way we would do during normal searches. I'm not sure, but I
think the temp file s
Welcome, Simon!
On Fri, Jun 20, 2025 at 9:32 AM Mikhail Khludnev wrote:
>
> Welcome Simon!
>
> On Fri, Jun 20, 2025 at 3:52 PM Simon Cooper
> wrote:
>>
>> Hi everyone,
>>
>> Many thanks for the invite! I'm delighted to accept, and look forward to
>> future Lucene development.
>>
>> I live in C
We've recently been comparing Lucene's HNSW w/FAISS' and there is not
a 2x difference there. FAISS does seem to be around 10-15% faster I
think? The 2x difference is roughly what I was seeing in comparisons
w/hnswlib prior to the dot-product improvements we made in Lucene.
On Thu, Jun 19, 2025 at
I would recommend including this commit
https://github.com/apache/lucene/commit/86091d379820f65eec0e0f57f075b665d843f31a
(sorry, no PR!) which is a small test fix needed along with removing
HNSW connectedComponents
On Wed, Jun 11, 2025 at 12:33 PM Chris Hegarty
wrote:
>
> Github milestones are no
wrote:
>
> I'm wondering if this is the same idea that Kaival is proposing in
> https://github.com/apache/lucene/issues/14758 (Support multiple HNSW graphs
> backed by the same vectors).
>
> On Thu, Jun 5, 2025 at 11:32 AM Michael Sokolov wrote:
>
> > I do think there c
key (customer id?) to the vectors somehow? If this was done
> > well it should lead to a natural clustering of the graph.
> >
>
> I can explore further on this. Thanks for the pointers.
>
> On Mon, Jun 2, 2025 at 11:14 PM Michael Sokolov wrote:
>
> > I wonder i
e docs range could vary in extremes from few 10s to tens-of-thousands
> and in very heavy usage cases, 100k and above… in a single segment
>
> Filtered HNSW, like you said, uses a single graph, which could be better if
> designed as sub-graphs
>
> On Mon, 2 Jun 2025 at 5:42 PM, Mic
How many documents do you anticipate in a typical sub range? If it's in the
hundreds or even low thousands you would be better off without hnsw.
Instead you can use a function score query based on the vector distance.
For larger numbers where hnsw becomes useful, you could try using filtered
hnsw,
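For small sub-ranges, the brute-force alternative to HNSW amounts to scoring every candidate exactly and keeping the best k. Here is a minimal sketch of that idea in plain Java rather than through Lucene's query API. The class name `ExactSubRangeSearch`, its methods, and the use of dot product as the similarity are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of exact (non-HNSW) scoring; not Lucene API.
class ExactSubRangeSearch {

  // Dot product of two equal-length vectors.
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  // Scores every candidate doc exactly and returns the ids of the k best, best first.
  static List<Integer> topK(float[][] vectors, int[] candidates, float[] query, int k) {
    float[] scores = new float[vectors.length];
    List<Integer> ids = new ArrayList<>();
    for (int doc : candidates) {
      scores[doc] = dot(vectors[doc], query);
      ids.add(doc);
    }
    // Descending score; ties broken by ascending doc id so the order is deterministic.
    ids.sort(
        Comparator.comparingDouble((Integer doc) -> -scores[doc])
            .thenComparingInt(Integer::intValue));
    return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
  }

  public static void main(String[] args) {
    float[][] vectors = {{1f, 0f}, {0f, 1f}, {0.9f, 0.1f}, {0.5f, 0.5f}};
    int[] subRange = {1, 2, 3}; // e.g. the docs that belong to one sub-range
    System.out.println(topK(vectors, subRange, new float[] {1f, 0f}, 2));
  }
}
```

The cost is linear in the size of the sub-range, which is why it only makes sense when that range stays in the hundreds or low thousands.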
The message is telling you that you previously indexed the field
boe.search.wild_description with offsets and now you are trying to
index it without offsets. This probably indicates you are using a
different Analyzer, which is generally *not ok* since indexed fields
must be indexed in a consistent
Welcome and congratulations, Ankit!
On Wed, May 7, 2025, 1:12 AM Michael Froh wrote:
> Congratulations Ankit! This is fantastic!
>
> On Tue, May 6, 2025 at 3:36 PM Alan Woodward wrote:
>
>> Welcome Ankit!
>>
>>
>> On 6 May 2025, at 20:05, Ankit Jain wrote:
>>
>> Thanks everyone for the warm we
Welcome, Stefan!
On Mon, May 5, 2025 at 1:36 AM Adrien Grand wrote:
>
> Congratulations and welcome, Stefan!
>
>
> On Mon, May 5, 2025, 01:24, Vigya Sharma wrote:
>>
>> Congratulations Stefan!
>>
>> On Sun, May 4, 2025 at 3:31 PM Andriy Redko wrote:
>>>
>>> Congrats Stefan!!!
>>>
>>> On Sun,
" to a
>> larger value, because Jenkins got much faster, so builds are recycled
>> faster.
>>
>> Uwe
>>
>> On 14.04.2025 at 19:31, Michael Sokolov wrote:
>> > Hi Uwe, I just want to see the logs output by failed tests in Jenkins;
>> > I have no need
soon, just a bit busy.
>
> Uwe
>
> On 14.04.2025 at 13:43, Michael Sokolov wrote:
> > I tried to investigate, but don't have (lost?) access to Policeman
> > Jenkins - I guess it was rebuilt recently? Is there a way to extend
> > access to Lucene committers at
I tried to investigate, but don't have (lost?) access to Policeman
Jenkins - I guess it was rebuilt recently? Is there a way to extend
access to Lucene committers at least?
On Sat, Apr 12, 2025 at 9:01 PM Policeman Jenkins Server
wrote:
>
> Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux
You can combine queries; they are composable. Whether it makes sense
or not for your use case is something you will have to decide. To me
it's hard to see a case where vector query 1 AND vector query 2 would
be preferable to combining the vectors "up front" (ie when creating
the vectors), but mayb
It makes sense to me. I think it's providing marginal benefits, and the
downside is bad
On Thu, Apr 3, 2025, 4:58 AM Benjamin Trent wrote:
> Hey y'all,
>
> Unless there is strong dissenting opinion, I think we should revert the
> connected components work in HNSW for 10.2 as a bug fix.
> https:/
Woohoo! thanks Uwe; exciting you were able to get 2x the lifespan of
the drives. Let's go for 4x this time!
On Tue, Mar 18, 2025 at 12:53 PM Uwe Schindler wrote:
>
> Moin moin,
>
> Policeman Jenkins got new hardware yesterday - no functional changes.
>
> Background: The old server had some stran
One idea I've heard batted around is to override
IndexSearcher.createWeight in a profiling IndexSearcher and then wrap
Weights and finally Scorers in order to emit metrics on every advance
or other low-level operation. This could help with search profiling I
guess although of course it will slow e
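The wrapping idea can be sketched with a simplified iterator in plain Java. Everything here is a hypothetical stand-in: `DocIterator`, `ArrayIterator` and `ProfilingIterator` are not Lucene classes, and a real version would wrap `Weight` and `Scorer` objects created through `IndexSearcher.createWeight`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins for illustration; these are not Lucene's Weight/Scorer classes.
class ProfilingIteratorSketch {

  // Simplified doc-id iterator with a single low-level operation.
  interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Moves to the first doc id >= target and returns it, or NO_MORE_DOCS.
    int advance(int target);
  }

  // Iterates over a sorted array of doc ids.
  static final class ArrayIterator implements DocIterator {
    private final int[] docs;
    private int pos = 0;

    ArrayIterator(int[] sortedDocs) {
      this.docs = sortedDocs;
    }

    @Override
    public int advance(int target) {
      while (pos < docs.length && docs[pos] < target) {
        pos++;
      }
      return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }
  }

  // Decorator that delegates each call, then records the call count and elapsed time.
  static final class ProfilingIterator implements DocIterator {
    private final DocIterator in;
    final AtomicLong advanceCalls = new AtomicLong();
    final AtomicLong advanceNanos = new AtomicLong();

    ProfilingIterator(DocIterator in) {
      this.in = in;
    }

    @Override
    public int advance(int target) {
      long start = System.nanoTime();
      try {
        return in.advance(target);
      } finally {
        advanceCalls.incrementAndGet();
        advanceNanos.addAndGet(System.nanoTime() - start);
      }
    }
  }

  public static void main(String[] args) {
    ProfilingIterator it = new ProfilingIterator(new ArrayIterator(new int[] {2, 5, 9}));
    it.advance(0);
    it.advance(5);
    it.advance(100);
    System.out.println("advance calls: " + it.advanceCalls.get());
  }
}
```

The two `System.nanoTime()` calls around every operation are where the slowdown comes from, so a wrapper like this is something to enable only while profiling.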
Willkommen, bienvenue, welcome, Froh!
On Thu, Mar 6, 2025, 4:29 AM Chris Hegarty
wrote:
> Congrats and welcome, Michael!
>
> -Chris
>
> > On 6 Mar 2025, at 08:05, Dawid Weiss wrote:
> >
> >
> > Hello everyone,
> >
> > I'm pleased to announce that Michael Froh has accepted the PMC's
> invitation
One thing to check is whether the synonyms are configured as
bidirectional, or which direction they go (eg is "a b" being expanded
to "ab" but "ab" is not being expanded to "a b"??)
On Wed, Mar 5, 2025 at 2:20 PM Mikhail Khludnev wrote:
>
> Hello Trevor.
>
> Maintaining such a synonym map is too
Stored fields is a separate format that stores data in a row-wise
fashion: all the stored data for a single document is written
together. Vectors aren't *also* copied into stored fields storage, so
the stored fields API can't be used to retrieve them. If we did allow
that it would result in massiv
I recently unsubscribed from issues@ because it seemed to me it was duplicative
with the emails I was already receiving from github. Is it true that they have
the same contents (with different formatting)? If so, should we drop issues@?
Or maybe it remains useful for users lacking github account
this one did not reproduce for me on main just now
On Wed, Jan 8, 2025 at 4:36 AM Apache Jenkins Server
wrote:
>
> Build:
> https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1535/
>
> 3 tests failed.
> FAILED: org.apache.lucene.misc.index.TestBpVectorReorderer.testQuantizedIn
I noticed that at least some of the build failure emails coming from
policeman do not include the "reproduce with" command line. While we
can dig it out by clicking through to jenkins, it would be nice if it
were directly accessible in the email too (as it is on the ci build
emails). Is that possi
SUCCESS! [1:54:16.305613]
+1
On Mon, Dec 16, 2024 at 4:32 AM Ignacio Vera wrote:
>
> SUCCESS! [1:16:40.906176]
>
> +1
>
> On Mon, Dec 16, 2024 at 9:05 AM Luca Cavanna wrote:
> >
> > Ok, thanks for clarifying Jim, that means we go ahead with the RC1 vote,
> > unless there are objections.
> >
> >
That makes sense to me too in the abstract. At Amazon we also have
interesting BDV fields we have to decode on the fly, so this looks
attractive for that reason (not just faceting).
I would say though that it would be easier to evaluate the fitness for
purpose (faceting) if we had some examples of
Sparse is meaning two different things here. In the case you found Mikhail,
it means not every document has a value for some vector field. I think the
question here is about very high dimensional vectors where most documents
have zeroes in most dimensions of the vector.
On Tue, Dec 3, 2024, 2:01 A
Another way is using postings - you can represent each dimension as a
term (`dim0`, `dim1`, etc) and index those that occur in a document.
To encode a value for a dimension you can either provide a custom term
frequency, or index the term multiple times. Then when searching you
can form a BooleanQu
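A rough in-memory model of the postings approach follows, assuming positive integer weights stored as term frequencies and a raw dot product as the score. It is plain Java with hypothetical names (`SparseVectorPostings`, `Posting`) and does not use Lucene classes. In Lucene itself the score would also depend on the configured Similarity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of "one term per dimension"; not Lucene API.
class SparseVectorPostings {

  // One posting: a doc id plus the weight stored as a term frequency.
  static final class Posting {
    final int doc;
    final int freq;

    Posting(int doc, int freq) {
      this.doc = doc;
      this.freq = freq;
    }
  }

  // term ("dim0", "dim1", ...) -> postings of the docs with a non-zero value there.
  private final Map<String, List<Posting>> index = new HashMap<>();

  // Indexes only the non-zero dimensions of a document's vector.
  void add(int doc, int[] vector) {
    for (int dim = 0; dim < vector.length; dim++) {
      if (vector[dim] != 0) {
        index
            .computeIfAbsent("dim" + dim, term -> new ArrayList<>())
            .add(new Posting(doc, vector[dim]));
      }
    }
  }

  // Disjunction over the query's non-zero dimensions; score = sum of queryWeight * freq.
  Map<Integer, Long> search(int[] query) {
    Map<Integer, Long> scores = new HashMap<>();
    for (int dim = 0; dim < query.length; dim++) {
      if (query[dim] == 0) {
        continue; // dimensions absent from the query never touch their postings
      }
      List<Posting> postings = index.getOrDefault("dim" + dim, Collections.<Posting>emptyList());
      for (Posting p : postings) {
        scores.merge(p.doc, (long) query[dim] * p.freq, Long::sum);
      }
    }
    return scores;
  }

  public static void main(String[] args) {
    SparseVectorPostings idx = new SparseVectorPostings();
    idx.add(0, new int[] {0, 3, 0, 1});
    idx.add(1, new int[] {2, 0, 0, 0});
    idx.add(2, new int[] {0, 1, 0, 4});
    System.out.println(idx.search(new int[] {0, 2, 0, 1}));
  }
}
```

Only the dimensions that are non-zero in both the query and a document contribute any work, which is the appeal of this layout for very high-dimensional, mostly-zero vectors.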
That's interesting! One thing I'd say is we don't want to be
optimizing for the random vector use case, so from that perspective
this is less concerning. However we also don't want to have poor
worst-case performance, so we should address this somehow. If you want
to probe for degenerate cases, yo
Do you actually use org.apache.lucene.replicator.http ? If not then
this wouldn't have any material impact on your application.
On Mon, Oct 28, 2024 at 4:25 AM Renaud SAINT-GRATIEN
wrote:
>
> CONFIDENTIAL
>
> Hello,
>
> Is there any plan to patch Lucene 8.11 for CVE-2024-45772 ?
> I need to stay
I think this might be a better question for solr-user@? EG I don't
understand how Solr decides which Query to send to populateScores --
is it the same one that was used to generate the matches in topDocs?
It seems as if it should be, but then this error shouldn't happen ...
I wonder if you can prin
Maybe we would want to have commit-triggered builds as well and those could
go in a common pool? We actually do that for PRs with GitHub now and it
seems to work out okay, but it's not exactly like having a build for each
commit to main
On Tue, Oct 15, 2024, 8:02 AM Jason Gerlowski wrote:
> > J
+1
SUCCESS! [1:25:19.133719]
On Thu, Oct 10, 2024 at 9:42 AM Stefan Vodita wrote:
>
> +1 SUCCESS! [0:35:35.851448]
>
> We didn't mention the new aggregation engine [1] in the release notes for 9.12
> and that seems like a missed opportunity. It's a significant change, which
> improves over the
SUCCESS! [1:24:53.393070]
On Thu, Oct 3, 2024 at 9:43 AM Benjamin Trent wrote:
>
> +1 SUCCESS! [0:56:38.403983]
>
> On Thu, Oct 3, 2024 at 5:51 AM Stefan Vodita wrote:
> >
> > +1 SUCCESS! [0:39:04.597088]
> >
> >
> > On Thu, 3 Oct 2024 at 07:48, Luca Cavanna wrote:
> >>
> >> Please vote for rel
um, +1
On Thu, Oct 3, 2024 at 10:39 AM Michael Sokolov wrote:
>
> SUCCESS! [1:24:53.393070]
>
> On Thu, Oct 3, 2024 at 9:43 AM Benjamin Trent wrote:
> >
> > +1 SUCCESS! [0:56:38.403983]
> >
> > On Thu, Oct 3, 2024 at 5:51 AM Stefan Vodita
> > wrot
Thanks, done
On Mon, Sep 30, 2024 at 10:39 AM Chris Hegarty
wrote:
>
> Hi Mike,
>
> > On 30 Sep 2024, at 15:35, Michael Sokolov wrote:
> >
> > Chris - I wonder if it would be OK to cherry-pick
> > f2b2bfc414873558bf8a18be3c40fe67939dd25e to branch_10_0? It is a
Chris - I wonder if it would be OK to cherry-pick
f2b2bfc414873558bf8a18be3c40fe67939dd25e to branch_10_0? It is a
doc-only update referring to a change that is on that branch.
On Mon, Sep 30, 2024 at 6:47 AM Chris Hegarty
wrote:
>
> Hi,
>
> In preparation for the upcoming Lucene 10 release:
>
>
Lucene's test framework makes heavy use of randomization in order to
explore more of the vast space of possible states. You might be
familiar with this as "fuzz testing"? There's a blog post about it
here (from 2011!)
https://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.htm
retrieved.
On Sat, Sep 28, 2024 at 5:15 PM Michael Sokolov wrote:
>
> These failures relate to the way Arrays.binarySearch works when there
> are repeated values, in which case the result is undefined (it can be
> any of the indexes with the value), but in SlowCompositeReaderWrapper
>
These failures relate to the way Arrays.binarySearch works when there
are repeated values, in which case the result is undefined (it can be
any of the indexes with the value), but in SlowCompositeReaderWrapper
we are relying on finding the lowest-indexed of the repeats. I'll work
on a fix
On Sat,
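To illustrate the issue: `Arrays.binarySearch` documents that when the array contains multiple elements equal to the key, there is no guarantee which one is found. A lower-bound search always lands on the first of the repeats. The sketch below is one way to get that behavior. The class and method names are hypothetical, and this is not necessarily the fix that went into SlowCompositeReaderWrapper.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not taken from the Lucene code base.
class LowestIndexSearch {

  // Returns the smallest index i with sorted[i] == key, or -1 if key is absent.
  static int firstIndexOf(int[] sorted, int key) {
    int lo = 0;
    int hi = sorted.length; // the search window is [lo, hi)
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (sorted[mid] < key) {
        lo = mid + 1;
      } else {
        hi = mid; // keep moving left even when sorted[mid] == key
      }
    }
    return lo < sorted.length && sorted[lo] == key ? lo : -1;
  }

  public static void main(String[] args) {
    int[] starts = {0, 5, 5, 5, 12};
    // Arrays.binarySearch may return 1, 2 or 3 here; its javadoc does not say which.
    System.out.println("Arrays.binarySearch: " + Arrays.binarySearch(starts, 5));
    System.out.println("firstIndexOf: " + firstIndexOf(starts, 5));
  }
}
```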
SUCCESS! [0:59:38.025612]
+1
On Fri, Sep 27, 2024 at 6:01 AM 张超 wrote:
>
> +1
>
> SUCCESS! [1:11:28.608954]
>
> On Sep 27, 2024, at 04:35, Dawid Weiss wrote:
>
>
> SUCCESS! [3:04:58.589040]
>
> +1. Thank you, Chris.
>
> On Wed, Sep 25, 2024 at 7:02 PM Chris Hegarty
> wrote:
>>
>> Please vote for release c
Hi Adrien, I thought we had another week? I looked back at old emails and
thought you had targeted Sep 22 for feature freeze?
On Fri, Sep 13, 2024, 7:45 AM Adrien Grand wrote:
> Hello everyone,
>
> As previously discussed, I plan on feature freezing Lucene 9.12 and Lucene
> 10.0 next week. Prac
> If your two indexes load data sequentially and in the same order, then I
believe that you would get the same results. But we consider this an
implementation detail rather than a guarantee that Lucene should have.
You might even still be surprised by nondeterminism arising from
concurrency during
Hi, I've been looking into Adrien's suggestion to migrate
(Byte/Float)VectorValues to an unabashedly random-access API. We can
easily enough support iteration on top of that (which we use
extensively during indexing). I think this would represent a great
simplification; preliminary implementation s
Maybe getSlices has some side effect that messes up createWeight?
On Fri, Aug 16, 2024, 7:10 AM Michael Sokolov wrote:
> That is super weird. I wonder if changing the names of variables will make
> a difference. Have you verified that this effect is observable during all
> lunar phas
That is super weird. I wonder if changing the names of variables will make
a difference. Have you verified that this effect is observable during all
lunar phases?
I assume we looked at any profiler diffs we could get our hands on? If
not, maybe something would show up there.
On Thu, Aug 15, 2024,
Yes, there is no support for upgrading a pre-8.x index to 9 or later.
At some point it was decided that supporting that would lead to grief
for users and/or hamper development of Lucene, so now you can only
upgrade one major version. If you need to do so, the best supported
option is to write a pro
You could switch to DocValues, and it would probably be more efficient
if you are only retrieving a single stored field but you have a lot of
other ones in the index since stored fields are stored together and
have to be decoded together. As far as visiting every segment on disk
I'm not sure what
(
    TooComplexToDeterminizeException.class,
    () -> {
      new RegexpQuery(new Term("stringvalue", "(.*a){2000}"));
    });
}
On Tue, Aug 6, 2024 at 10:56 AM Michael Sokolov wrote:
>
> Yes, I think degenerate regexes like *a* are potentially costly.
> Actually some
Yes, I think degenerate regexes like *a* are potentially costly.
Actually something like *Ⱗ* is probably worse since yeah it would need
to scan the entire FST (which probably has some a's in it?)
I don't see any way around that aside from: (1) telling user don't do
that, or (2) putting some accoun
Welcome Armin!
On Fri, Jul 26, 2024 at 7:24 PM Greg Miller wrote:
>
> Welcome Armin!
>
> On Fri, Jul 26, 2024 at 10:51 AM Patrick Zhai wrote:
>>
>> Congrats and welcome, Armin!
>>
>>
>> On Fri, Jul 26, 2024, 10:30 Vigya Sharma wrote:
>>>
>>> Congratulations and welcome, Armin! Volunteering as a
ah that helps, thanks
On Tue, Jul 2, 2024 at 2:41 PM Robert Muir wrote:
>
> On Tue, Jul 2, 2024 at 1:59 PM Michael Sokolov wrote:
> >
> > Hi all - I wonder if anyone else is observing weird email behavior
> > from Github. I'm starting to see emails generated fro
Hi all - I wonder if anyone else is observing weird email behavior
from Github. I'm starting to see emails generated from PRs and issues
that are wildly out of date. Like one dated yesterday that was
generated from a comment that is weeks old. And I am missing many
current updates -- as if there is
SUCCESS! [0:55:48.190137]
(tested w/Corretto JDK)
+1
On Mon, Jun 24, 2024 at 8:01 AM Benjamin Trent wrote:
>
> SUCCESS! [0:40:46.898514]
>
> +1
>
> On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera wrote:
> >
> > Please vote for release candidate 1 for Lucene 9.11.1
> >
> >
> > The artifacts can be
Thanks for digging into this Dawid - I think it's important to keep an
IDE dev path pretty clear of underbrush in order to encourage new
joiners, even if it is not the primary or best means of building and
testing
On Thu, Jun 13, 2024 at 2:01 PM Dawid Weiss wrote:
>
>
> Hi Mike,
>
> Just FYI - I
then re-scan to do the actual quantization?
>
> I am not sure what you mean here by "merge the float vectors". If you
> mean simply reading the individual float vector files and combining
> them into a single file, we already do that separately from
> quantizing.
>
>
Hi folks. I've been experimenting with our new scalar quantization
support - yay, thanks for adding it! I'm finding that when I index a
large number of large vectors, enabling quantization (vs simply
indexing the full-width floats) requires more heap - I keep getting
OOMs and have to increase heap
If I set IJ build/test to "gradle" and then right click on "core" in
the Project tab -- it gives an option like "run tests in
lucene-root.lucene.core" which works. At the very top (lucene
[lucene-root]) of the hierarchy you can right-click and select "run
all tests", but this fails with "Error runn
>
> Yet I feel certain I have been able to run all tests in IJ before.
>
>
>
> I don't think this was ever the case with intellij. Or maybe you ran those
> tests via gradle?
When I say "run in IJ" I mean I right clicked a button somewhere and said
"run all tests" :) I expect it was with the gradl
OK, I can see how the directory structure might be at odds
w/intellij's view of the world. Yet I feel certain I have been able to
run all tests in IJ before.
Just to disconfirm my insanity I tried again building and running all
tests in core on branch_9x/main using both intellij and gradle
build/te
should work.
>
> Running via gradle is slow for me not just with Lucene but also with other
> projects... I can take a look but I'm pessimistic I can do any wonders here.
>
> Dawid
>
> On Fri, Jun 7, 2024 at 6:06 PM Michael Sokolov wrote:
>>
>
ule permissions thing
controlling the visibility of these symbols?
On Fri, Jun 7, 2024 at 11:53 AM Michael Sokolov wrote:
>
> hm I found FakeCharFilterFactory in src/test/META-INF.services -- it's
> in a "test sources root" folder and won't allow itself to be set as
ssing. This
can't be this hard!
On Fri, Jun 7, 2024 at 11:44 AM Michael Sokolov wrote:
>
> hmm so after playing around with this Intellij build for a bit I ran
> into some trouble -- all the tests relying on SPI seemed to start
> failing. So then I switched back to build with G
n 7, 2024 at 10:40 AM Michael Sokolov wrote:
>
> ok, life must be scary for developers on windows!
>
> On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss wrote:
> >
> >
> > Certain regenerate tasks do require perl and python indeed.
> >
> > On Fri, Jun 7, 2024 a
ok, life must be scary for developers on windows!
On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss wrote:
>
>
> Certain regenerate tasks do require perl and python indeed.
>
> On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov wrote:
>>
>> While editing this CONTRIBUTI
While editing this CONTRIBUTING.md I found the following statement:
Some build tasks (in particular `./gradlew check`) require Perl
and Python 3.
Is it actually true that we require Perl?
On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov wrote:
>
> So I'm glad we have a fix for thi
me problem and it seems better now. Thank you, Dawid!
>
> On Thu, 6 Jun 2024 at 12:20, Michael Sokolov wrote:
>>
>> Oh! TIL! so much better, thanks. And now I have the "Repeat" option
>> back in the test runner
>>
>> On Thu, Jun 6, 2024 at
ly. Switch it to compile and run using its
> own built-in method - much faster.
>
>
>
> Dawid
>
> On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov wrote:
>>
>> Hi, I wonder how many of us are using intellij to run Lucene tests, and if
>> you are, have you notic
Neat!
On Thu, Jun 6, 2024, 2:57 AM Balog Tamás
wrote:
> Dear Lucene Community,
> Since Tuesday, the IntelliJ plugin called [Lucas](
> https://plugins.jetbrains.com/plugin/24567-lucas) is available on the
> JetBrains Marketplace.
>
> It integrates / ports the Luke toolbox to the IntelliJ Platform
Hi, I wonder how many of us are using intellij to run Lucene tests, and if
you are, have you noticed it having gotten really quite slow? It seems to
take a long time doing... Something... Before the test starts running. I
have a suspicion that we are using gradle in a way that forces it to
rebuild
+1
(tested w/Amazon Corretto JVM)
SUCCESS! [0:46:40.066524]
On Mon, Jun 3, 2024 at 7:30 AM Benjamin Trent wrote:
>
> Please vote for release candidate 1 for Lucene 9.11.0
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3
I misread this as "Lucene 911" as in "Lucene Emergency!!!" -- might
not land for everyone - someday we will have Lucene 11.2? But ... no
concerns from me aside from the things you mentioned - thanks for
pushing, Ben
On Tue, May 28, 2024 at 9:58 AM Benjamin Trent wrote:
>
> Hey y'all,
>
> I am pla
I'm pretty sure it's only in core that we follow the no dependencies rule.
On Sat, May 18, 2024, 11:25 AM Bruno Roustant
wrote:
> The facet module has a dependency on com.carrotsearch:hppc.
>
> Is it possible to add the same dependency to the join module ? What is the
> rule ?
>
> Thanks
>
> Bru
We use it at Amazon. I can't really read it so I'm not sure, but I think
it's used to encode terms that come up that aren't handled well by the
standard dictionary.
On Sat, May 18, 2024 at 8:39 AM Bruno Roustant wrote:
>
> Hi,
>
> While looking at the various usages of Map with Integer keys, I found
th a code
> > search).
> > We also always merge down to one segment (historical but also we index
> > once and then there are no changes for a week to a month and then we
> > reindex every document from scratch).
> >
> > Your response is very helpful already and
It seems as if the term frequency for some term exceeded the maximum.
This can happen if you supplied custom term frequencies eg with
https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/analysis/tokenattributes/TermFrequencyAttribute.html?is-external=true
. The behavior didn't change since
I also found this helpful documentation by looking in the source code
of SearchFiles.java: https://lucene.apache.org/core/9_10_0/demo/
On Mon, Apr 22, 2024 at 4:40 AM Stefan Vodita wrote:
>
> Hi Siddharth,
>
> If you happen to be using IntelliJ, you can run a demo class from the IDE.
> It probabl
Thanks for the explanation. It makes sense that we start with a given
seed and then each iteration is different because it re-uses the same
Random instance (or whatever static state?) without re-initialization?
On Wed, Apr 3, 2024 at 6:09 PM Dawid Weiss wrote:
>
>
>> Now I just need to understand
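If that reading is right, the behavior resembles what plain `java.util.Random` does when it is created once from a seed and then reused across iterations. The sketch below shows only that general mechanism. It does not use Lucene's test framework, and the `SeededIterations` class is hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration using java.util.Random, not Lucene's test framework.
class SeededIterations {

  // Draws one value per iteration from a single Random created from the seed.
  static int[] run(long seed, int iterations) {
    Random random = new Random(seed); // initialized once, not once per iteration
    int[] values = new int[iterations];
    for (int i = 0; i < iterations; i++) {
      values[i] = random.nextInt(1000);
    }
    return values;
  }

  public static void main(String[] args) {
    // Iterations generally see different values because the generator state advances,
    // yet re-running with the same seed reproduces the whole sequence.
    System.out.println(Arrays.toString(run(42L, 5)));
    System.out.println(Arrays.toString(run(42L, 5)));
  }
}
```

Under this model, reproducing the third iteration of a failing run means replaying the first two as well, since the generator state at iteration three depends on everything drawn before it.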
>> <https://github.com/apache/lucene/blob/main/gradle/testing/beasting.gradle#L62-L66>
>> in beasting.gradle
>> <https://github.com/apache/lucene/blob/main/gradle/testing/beasting.gradle>
>> .
>>
>> - Shubham
>>
>> On Wed, Apr 3, 2024 at 1:49 AM Mi
14 PM Michael Sokolov wrote:
>
> Is there a convenient way to run a test multiple times with different
> seeds? Do I need to write my own script? I feel like I used to be able
> to do this in IntelliJ, but that option seems to have vanished, and I
> don't see any such option in
Is there a convenient way to run a test multiple times with different
seeds? Do I need to write my own script? I feel like I used to be able
to do this in IntelliJ, but that option seems to have vanished, and I
don't see any such option in gradle testOpts either. I tried
-tests.iter but that seems
This TestBooleanMinShouldMatch.testRandomQueries failure did not
reproduce for me on branch_9x, with JDK 11 or JDK 17 or JDK 21. I ran
it a few times.
TestByteVectorSimilarityQuery.testSomeDeletes reproduces reliably -
I'll see if I can find out why it's unstable
On Mon, Apr 1, 2024 at 9:50 AM Po
I guess it depends on what the problem with the project is. It seems
implicit in your ideas that the project has lost momentum; nobody is
contributing to it or maintaining it actively? But I just want to
point out there can be other problems that might need correction with
different solutions (too
timing makes sense to me. +1 for having a deadline to reduce
procrastination, but Adrien I don't honestly believe anyone who is
paying attention thinks that is what you have been doing!
On Wed, Mar 13, 2024 at 10:40 AM Adrien Grand wrote:
>
> Hello everyone!
>
> It's been ~2.5 years since we rele
Chrome on a Macbook, it's super dark. I can make
> it out but I gotta stare for a bit ... do they make light and dark mode
> .ico files in one!?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sun, Feb 25, 2024 at 6:05 PM Michael Sokolov
> wrote:
>
Welcome and congratulations, Chao!
On Sat, Feb 24, 2024 at 8:51 PM Christian Moen wrote:
>
> Congrats, Chao!
>
> On Wed, Feb 21, 2024 at 2:28 AM Adrien Grand wrote:
>>
>> I'm pleased to announce that Zhang Chao has accepted the PMC's
>> invitation to become a committer.
>>
>> Chao, the tradition
+1
On Fri, Feb 23, 2024 at 7:08 PM Stefan Vodita wrote:
>
> +1
>
> On Fri, 23 Feb 2024 at 11:24, Chris Hegarty
> wrote:
>>
>> Hi,
>>
>> Since the discussion on bumping the Lucene main branch to Java 21 is winding
>> down, let's hold a vote on this important change.
>>
>> Once bumped, the next
here is a favicon you might want to try: I cropped the "VL" from the
Apache Lucene logo (ok I guess it's an AL) -- if you save it as
favicon.ico in the root of your website (ie as url /favicon.ico) it
should show up in bookmarks, browser toolbars, etc as a handy memory
aid. Of course you might have