Re: [VOTE] Release Lucene 9.12.3 RC2

2025-09-24 Thread Michael Sokolov
SUCCESS! [1:01:55.136848] +1 On Wed, Sep 24, 2025 at 12:14 PM Ankit Jain wrote: > > Please vote for release candidate 2 for Lucene 9.12.3 > > > The artifacts can be downloaded from: > > https://dist.apache.org/repos/dist/dev/lucene/lucene-9.12.3-RC2-rev-f965e930673c1f5cb478dc6a8907f5fc4ef7b539 >

Re: Inconsistent randomisation behaviour

2025-09-24 Thread Michael Sokolov
did you also set the other test parameters like -Ptests.nightly=true ? On Wed, Sep 24, 2025 at 7:01 AM Alan Woodward wrote: > > Hi all, > > I’m seeing some strange behaviour trying to reproduce a test failure in > IntelliJ, compared to what happens when I run the test on the command line > via

Re: [VOTE] Release Lucene 9.12.3 RC1

2025-09-22 Thread Michael Sokolov
Thanks for explaining, Uwe. I noticed tests did not run, but I didn't see the error you found - I wonder, does smoketester print it out for you, or did you find the error in a log somewhere? On Mon, Sep 22, 2025 at 5:19 AM Uwe Schindler wrote: > > Hi, > > First of all, it passed for me on Policem

Re: Welcome Trevor McCulloch as Lucene committer

2025-09-19 Thread Michael Sokolov
Hello Trevor! On Fri, Sep 19, 2025, 12:31 AM Vigya Sharma wrote: > Congratulations Trevor! Welcome! > > On Thu, Sep 18, 2025 at 6:45 PM Ankit Jain wrote: > >> Congratulations Trevor! >> >> >> On Thu, Sep 18, 2025 at 6:36 PM 张超 wrote: >> >>> Congratulations and welcome, Trevor! >>> >>> On Sep 1

Re: Welcome Ramakrishna Chilaka as Lucene committer

2025-09-18 Thread Michael Sokolov
Felicitations and Salutations, Ramakrishna! On Thu, Sep 18, 2025 at 2:27 PM Michael Froh wrote: > > Congratulations Ramakrishna! Welcome aboard! > > Froh > > On Thu, Sep 18, 2025 at 7:31 AM Anshum Gupta wrote: >> >> Congratulations and welcome, Ramakrishna! >> >> On Wed, Sep 17, 2025 at 12:03 PM

Re: Proposing a Lucene 9.12.3

2025-09-18 Thread Michael Sokolov
/repos/dist/release/lucene/KEYS. Hopefully >> that should avoid any key related issues. >> Will also get my key signed by one of the committers to prevent tests from >> complaining. I am assuming that the gpg public key block does not change >> after getting it signed. >

Re: Proposing a Lucene 9.12.3

2025-09-16 Thread Michael Sokolov
Have you gotten a code signing gpg key? That usually seems to cause some delays since the key must be committed to a file in order for the release to go through On Tue, Sep 16, 2025, 3:17 PM Ankit Jain wrote: > Thanks Benjamin and Chris. Will kick off the release process later today > or tomorro

Re: [VOTE] Release Lucene 10.3.0 RC2

2025-09-09 Thread Michael Sokolov
+1 SUCCESS! [1:08:41.293177] On Tue, Sep 9, 2025 at 1:11 PM Ankit Jain wrote: > > +1 (non-binding) > > SUCCESS! [1:17:16.300605] > > > On Tue, Sep 9, 2025 at 9:33 AM Kaival Parikh wrote: >> >> Trying this for the first time! >> >> +1 (non-binding) >> >> SUCCESS! [0:37:52.845297] >> >> - Kaival

Re: [VOTE] Release Lucene 10.3.0 RC1

2025-09-05 Thread Michael Sokolov
oh I guess we need to actually create the thing, sorry I pulled the latest changes and ran the smoketester, but I think that actually just tested RC1, duh. I'll rescind my +1, and -1 for RC1 On Fri, Sep 5, 2025 at 5:53 PM Michael Sokolov wrote: > > SUCCESS! [1:08:55.634042] > >

Re: [VOTE] Release Lucene 10.3.0 RC1

2025-09-05 Thread Michael Sokolov
e should bump the upper bound JDK version check in our vectorization >> provider. >> >> The changes are trivial, but do we want to do a new RC for this, since JDK >> 25 is due Sep 16th. >> >> https://github.com/apache/lucene/pull/15157 - work in progress, but I

Re: [VOTE] Release Lucene 10.3.0 RC1

2025-09-04 Thread Michael Sokolov
he.org syncs). >>> >>> On Thu, Sep 4, 2025 at 6:22 PM Adrien Grand wrote: >>>> >>>> I'm running with Python 3.12.1 and don't get this error. >>>> >>>> However GPG complains about the signature. ("gpg: Can't check s

Re: [VOTE] Release Lucene 10.3.0 RC1

2025-09-04 Thread Michael Sokolov
does the smokeTester require fancy new python now? It seems to fail for me with python 3.9: sokolovm@sok➜~/workspace/lucene(branch_10_3)» python3 -u dev-tools/scripts/smokeTestRelease.py \ [14:56:23] https://dist.apache.org/repos/dist/dev/lucene/lucene-10.3.0-RC1-rev-878a3db9c2d029020b0fcb2b

Re: Integer.MAX_VALUE sentinel never checked?

2025-09-01 Thread Michael Sokolov
What could be clearer evidence for Lucene's 1200 year cycle of rebirth and renewal? On Mon, Sep 1, 2025, 3:01 AM Dawid Weiss wrote: > Some message from the future? 3203/12/01 or does it come from the far >> past? At least no longer relevant, we left TermEnum and TermPositions >> classes beh

Re: called on the wrong instance

2025-08-14 Thread Michael Sokolov
oh! thanks - my ability to search for strings in files is not as good as I thought! On Thu, Aug 14, 2025 at 5:39 PM Chris Hostetter wrote: > > > https://github.com/apache/lucene/pull/15044 > > : Date: Thu, 14 Aug 2025 20:14:07 + > : From: Mike Sokolov > : Reply-To: dev@lucene.apache.org > :

Re: Proposing a 10.3 release in September

2025-08-12 Thread Michael Sokolov
OK, 10.3 release as an on-stage performance, Vigya! On Tue, Aug 12, 2025 at 7:15 AM Michael McCandless wrote: > > +1 for 10.3 release. Note that Community over Code NA is Sep 11 - Sep 14 in > Minneapolis so some of us might be less responsive ... but I don't think > that's a reason to alter th

Re: Proposing a 10.3 release in September

2025-08-08 Thread Michael Sokolov
yes, thanks! Looks like it will be a nice release On Fri, Aug 8, 2025 at 7:22 AM Adrien Grand wrote: > > +1 Thank you Chris! > > Adrien > > On Fri, Aug 8, 2025, 13:19, Chris Hegarty > wrote: >> >> Hi everyone, >> >> branch_10x is looking great - we’ve got a bunch of solid improvements and >

Re: IOContext, ReadAdvice, madvise

2025-08-08 Thread Michael Sokolov
e up to each directory implementation to look at the hints > >> specified, and use those to inform how it should open the files. At the > >> moment, MMapDirectory is the only one which does this, and it does this > >> using different ReadAdvices based on the hints. Exactly w

IOContext, ReadAdvice, madvise

2025-08-07 Thread Michael Sokolov
I want to raise an issue here that has come up before which is about the choices we have made to apply madvise flags in an opinionated way. In our environment, the choices Lucene is making are really detrimental to our indexing throughput. In the past we had disabled this by subclassing MMapDir

Re: Welcome Vigya Sharma to the Lucene PMC

2025-08-04 Thread Michael Sokolov
Welcome to PMC, Vigya! On Mon, Aug 4, 2025 at 5:26 AM Uwe Schindler wrote: > > Welcome and congrats Vigya Sharma! > > Uwe > > On 02.08.2025 at 18:20, Michael McCandless wrote: > > Hello Lucene developers, > > I'm happy to announce that Vigya Sharma has accepted an invitation to join > the Lucen

Re: Welcome Pan Guixin as Lucene committer

2025-07-21 Thread Michael Sokolov
congratulations and welcome! Mike On Mon, Jul 21, 2025 at 6:13 AM Uwe Schindler wrote: > > Hi, > > Congrats and welcome! > > Uwe > > On 19.07.2025 at 02:47, Adrien Grand wrote: > > Hello everyone, > > I'm pleased to announce that Pan Guixin has accepted the PMC's invitation to > become a commi

Re: lucene/core/src/java24

2025-07-09 Thread Michael Sokolov
> Just some additional info: From what I figured out, cherry-picking to > stable branch is possible, although the sourceset and directories have > different names, because git is intelligent enough to recognize this > like a file rename. > > Uwe > > Am 08.07.2025 um 22:00 sc

Re: lucene/core/src/java24

2025-07-08 Thread Michael Sokolov
ava versions. Therefore the > compilation is done using a stub: > https://github.com/apache/lucene/tree/main/lucene/core/src/generated/jdk > > In addition when Panama Vector enters Preview mode (should happen hopefully > soon after Java 25) we need a separate sourceset anyways. >

lucene/core/src/java24

2025-07-08 Thread Michael Sokolov
I'm curious why we have lucene/core/src/java24 directory/module on main branch instead of moving these classes to lucene/core/src/java, now that JDK24 is mandatory.

Re: Welcome Simon Cooper as Lucene committer

2025-06-20 Thread Michael Sokolov
Welcome, Simon! On Fri, Jun 20, 2025 at 9:32 AM Mikhail Khludnev wrote: > > Welcome Simon! > > On Fri, Jun 20, 2025 at 3:52 PM Simon Cooper > wrote: >> >> Hi everyone, >> >> Many thanks for the invite! I'm delighted to accept, and look forward to >> future Lucene development. >> >> I live in C

Re: Do we know why Lucene's HNSW may be slower than other HNSW implementations?

2025-06-19 Thread Michael Sokolov
We've recently been comparing Lucene's HNSW w/FAISS' and there is not a 2x difference there. FAISS does seem to be around 10-15% faster I think? The 2x difference is roughly what I was seeing in comparisons w/hnswlib prior to the dot-product improvements we made in Lucene. On Thu, Jun 19, 2025 at

Re: Proposing 10.2.2 and 9.12.2 releases

2025-06-11 Thread Michael Sokolov
I would recommend including this commit https://github.com/apache/lucene/commit/86091d379820f65eec0e0f57f075b665d843f31a (sorry, no PR!) which is a small test fix needed along with removing HNSW connectedComponents On Wed, Jun 11, 2025 at 12:33 PM Chris Hegarty wrote: > > Github milestones are no

Re: Welcome Ankit Jain as Lucene committer

2025-05-07 Thread Michael Sokolov
Welcome and congratulations, Ankit! On Wed, May 7, 2025, 1:12 AM Michael Froh wrote: > Congratulations Ankit! This is fantastic! > > On Tue, May 6, 2025 at 3:36 PM Alan Woodward wrote: > >> Welcome Ankit! >> >> >> On 6 May 2025, at 20:05, Ankit Jain wrote: >> >> Thanks everyone for the warm we

Re: Welcome Stefan Vodita to the Lucene PMC

2025-05-05 Thread Michael Sokolov
Welcome, Stefan! On Mon, May 5, 2025 at 1:36 AM Adrien Grand wrote: > > Congratulations and welcome, Stefan! > > > On Mon, May 5, 2025, 01:24, Vigya Sharma wrote: >> >> Congratulations Stefan! >> >> On Sun, May 4, 2025 at 3:31 PM Andriy Redko wrote: >>> >>> Congrats Stefan!!! >>> >>> On Sun,

Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-25-ea+15) - Build # 54509 - Unstable!

2025-04-14 Thread Michael Sokolov
" to a >> larger value, because Jenkins got much faster, so builds are recycled >> faster. >> >> Uwe >> >> Am 14.04.2025 um 19:31 schrieb Michael Sokolov: >> > Hi Uwe, I just want to see the logs output by failed tests in Jenkins; >> > I have no need

Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-25-ea+15) - Build # 54509 - Unstable!

2025-04-14 Thread Michael Sokolov
oon, just a bit busy. > > Uwe > > On 14.04.2025 at 13:43, Michael Sokolov wrote: > > I tried to investigate, but don't have (lost?) access to Policeman > > Jenkins - I guess it was rebuilt recently? Is there a way to extend > > access to Lucene committers at

Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-25-ea+15) - Build # 54509 - Unstable!

2025-04-14 Thread Michael Sokolov
I tried to investigate, but don't have (lost?) access to Policeman Jenkins - I guess it was rebuilt recently? Is there a way to extend access to Lucene committers at least? On Sat, Apr 12, 2025 at 9:01 PM Policeman Jenkins Server wrote: > > Build: https://jenkins.thetaphi.de/job/Lucene-main-Linux

Re: Proposing a 10.2.0 release

2025-04-03 Thread Michael Sokolov
It makes sense to me. I think it's providing marginal benefits, and the downside is bad On Thu, Apr 3, 2025, 4:58 AM Benjamin Trent wrote: > Hey y'all, > > Unless there is strong dissenting opinion, I think we should revert the > connected components work in HNSW for 10.2 as a bug fix. > https:/

Re: Exploring Telemetry for Lucene

2025-03-19 Thread Michael Sokolov
One idea I've heard batted around is to override IndexSearcher.createWeight in a profiling IndexSearcher and then wrap Weights and finally Scorers in order to emit metrics on every advance or other low-level operation. This could help with search profiling I guess although of course it will slow e

Re: Policeman Jenkins => new hardware

2025-03-19 Thread Michael Sokolov
Woohoo! thanks Uwe; exciting you were able to get 2x the lifespan of the drives. Let's go for 4x this time! On Tue, Mar 18, 2025 at 12:53 PM Uwe Schindler wrote: > > Moin moin, > > Policeman Jenkins got new hardware yesterday - no functional changes. > > Background: The old server had some stran

Re: Welcome Michael Froh as Lucene committer

2025-03-06 Thread Michael Sokolov
Willkommen, bienvenue, welcome, Froh! On Thu, Mar 6, 2025, 4:29 AM Chris Hegarty wrote: > Congrats and welcome, Michael! > > -Chris > > > On 6 Mar 2025, at 08:05, Dawid Weiss wrote: > > > > > > Hello everyone, > > > > I'm pleased to announce that Michael Froh has accepted the PMC's > invitation

issues mailing list

2025-01-23 Thread Michael Sokolov
I recently unsubscribed from issues@ because it seemed to me it was duplicative with the emails I was already receiving from github. Is it true that they have the same contents (with different formatting)? If so, should we drop issues@? Or maybe it remains useful for users lacking github account

Re: [JENKINS] Lucene » Lucene-NightlyTests-main - Build # 1535 - Unstable!

2025-01-08 Thread Michael Sokolov
this one did not reproduce for me on main just now On Wed, Jan 8, 2025 at 4:36 AM Apache Jenkins Server wrote: > > Build: > https://ci-builds.apache.org/job/Lucene/job/Lucene-NightlyTests-main/1535/ > > 3 tests failed. > FAILED: org.apache.lucene.misc.index.TestBpVectorReorderer.testQuantizedIn

policeman build emails lack repro line

2025-01-08 Thread Michael Sokolov
I noticed that at least some of the build failure emails coming from policeman do not include the "reproduce with" command line. While we can dig it out by clicking through to jenkins, it would be nice if it were directly accessible in the email too (as it is on the ci build emails). Is that possi

Re: [VOTE] Release Lucene 10.1.0 RC1

2024-12-16 Thread Michael Sokolov
SUCCESS! [1:54:16.305613] +1 On Mon, Dec 16, 2024 at 4:32 AM Ignacio Vera wrote: > > SUCCESS! [1:16:40.906176] > > +1 > > On Mon, Dec 16, 2024 at 9:05 AM Luca Cavanna wrote: > > > > Ok, thanks for clarifying Jim, that means we go ahead with the RC1 vote, > > unless there are objections. > > > >

Re: Off-heap binary doc values

2024-12-05 Thread Michael Sokolov
That makes sense to me too in the abstract. At Amazon we also have interesting BDV fields we have to decode on the fly, so this looks attractive for that reason (not just faceting). I would say though that it would be easier to evaluate the fitness for purpose (faceting) if we had some examples of

Re: Jenkins issues

2024-10-15 Thread Michael Sokolov
Maybe we would want to have commit-triggered builds as well and those could go in a common pool? We actually do that for PRs with GitHub now and it seems to work out okay, but it's not exactly like having a build for each commit to main On Tue, Oct 15, 2024, 8:02 AM Jason Gerlowski wrote: > > J

Re: [VOTE] Release Lucene 10.0.0 RC4

2024-10-10 Thread Michael Sokolov
+1 SUCCESS! [1:25:19.133719] On Thu, Oct 10, 2024 at 9:42 AM Stefan Vodita wrote: > > +1 SUCCESS! [0:35:35.851448] > > We didn't mention the new aggregation engine [1] in the release notes for 9.12 > and that seems like a missed opportunity. It's a significant change, which > improves over the

Re: [VOTE] Release Lucene 10.0.0 RC2

2024-10-03 Thread Michael Sokolov
SUCCESS! [1:24:53.393070] On Thu, Oct 3, 2024 at 9:43 AM Benjamin Trent wrote: > > +1 SUCCESS! [0:56:38.403983] > > On Thu, Oct 3, 2024 at 5:51 AM Stefan Vodita wrote: > > > > +1 SUCCESS! [0:39:04.597088] > > > > > > On Thu, 3 Oct 2024 at 07:48, Luca Cavanna wrote: > >> > >> Please vote for rel

Re: [VOTE] Release Lucene 10.0.0 RC2

2024-10-03 Thread Michael Sokolov
um, +1 On Thu, Oct 3, 2024 at 10:39 AM Michael Sokolov wrote: > > SUCCESS! [1:24:53.393070] > > On Thu, Oct 3, 2024 at 9:43 AM Benjamin Trent wrote: > > > > +1 SUCCESS! [0:56:38.403983] > > > > On Thu, Oct 3, 2024 at 5:51 AM Stefan Vodita > > wrot

Re: Preparation and branching for Lucene 10

2024-09-30 Thread Michael Sokolov
Thanks, done On Mon, Sep 30, 2024 at 10:39 AM Chris Hegarty wrote: > > Hi Mike, > > > On 30 Sep 2024, at 15:35, Michael Sokolov wrote: > > > > Chris - I wonder if it would be OK to cherry-pick > > f2b2bfc414873558bf8a18be3c40fe67939dd25e to branch_10_0? It is a

Re: Preparation and branching for Lucene 10

2024-09-30 Thread Michael Sokolov
Chris - I wonder if it would be OK to cherry-pick f2b2bfc414873558bf8a18be3c40fe67939dd25e to branch_10_0? It is a doc-only update referring to a change that is on that branch. On Mon, Sep 30, 2024 at 6:47 AM Chris Hegarty wrote: > > Hi, > > In preparation for the upcoming Lucene 10 release: > >

Re: I Have a Question in 'BaseTokenStreamTestCase.java'

2024-09-29 Thread Michael Sokolov
Lucene's test framework makes heavy use of randomization in order to explore more of the vast space of possible states. You might be familiar with this as "fuzz testing"? There's a blog post about it here (from 2011!) https://blog.mikemccandless.com/2011/03/your-test-cases-should-sometimes-fail.htm

Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-24-ea+16) - Build # 50364 - Unstable!

2024-09-28 Thread Michael Sokolov
retrieved. On Sat, Sep 28, 2024 at 5:15 PM Michael Sokolov wrote: > > These failures relate to the way Arrays.binarySearch works when there > are repeated values, in which case the result is undefined (it can be > any of the indexes with the value), but in SlowCompositeReaderWrapper >

Re: [JENKINS-EA] Lucene-main-Linux (64bit/hotspot/jdk-24-ea+16) - Build # 50364 - Unstable!

2024-09-28 Thread Michael Sokolov
These failures relate to the way Arrays.binarySearch works when there are repeated values, in which case the result is undefined (it can be any of the indexes with the value), but in SlowCompositeReaderWrapper we are relying on finding the lowest-indexed of the repeats. I'll work on a fix On Sat,
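The point about repeated values can be shown with a small sketch. `Arrays.binarySearch` is documented to give no guarantee about which index it returns when the array contains several elements equal to the key, so code that needs the lowest-indexed match has to search for it explicitly. The class and method names below (`LowestIndexSearch`, `lowestIndexOf`) and the sample array are hypothetical and only illustrate the idea; this is not the fix that went into SlowCompositeReaderWrapper.

```java
import java.util.Arrays;

/** Illustrative only; names are hypothetical and this is not Lucene code. */
final class LowestIndexSearch {

  /**
   * Returns the smallest index i with sorted[i] == key. When the key is absent,
   * returns -(insertionPoint) - 1, the same convention Arrays.binarySearch uses.
   */
  static int lowestIndexOf(int[] sorted, int key) {
    int lo = 0;
    int hi = sorted.length; // exclusive upper bound
    while (lo < hi) {
      int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
      if (sorted[mid] < key) {
        lo = mid + 1; // everything up to mid is smaller than key
      } else {
        hi = mid; // sorted[mid] >= key, so the first match is at mid or earlier
      }
    }
    return (lo < sorted.length && sorted[lo] == key) ? lo : -lo - 1;
  }

  public static void main(String[] args) {
    int[] starts = {0, 5, 5, 5, 9}; // made-up data with a repeated value
    // May be 1, 2 or 3: the JDK does not specify which of the equal elements is found.
    System.out.println("Arrays.binarySearch: " + Arrays.binarySearch(starts, 5));
    // Always the first of the repeats.
    System.out.println("lowestIndexOf:       " + lowestIndexOf(starts, 5));
  }
}
```

Keeping the return convention of `Arrays.binarySearch` for missing keys means a caller could swap one for the other without changing how negative results are handled.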

Re: [VOTE] Release Lucene 9.12.0 RC2

2024-09-27 Thread Michael Sokolov
SUCCESS! [0:59:38.025612] +1 On Fri, Sep 27, 2024 at 6:01 AM 张超 wrote: > > +1 > > SUCCESS! [1:11:28.608954] > > On Sep 27, 2024, 04:35, Dawid Weiss wrote: > > > SUCCESS! [3:04:58.589040] > > +1. Thank you, Chris. > > On Wed, Sep 25, 2024 at 7:02 PM Chris Hegarty > wrote: >> >> Please vote for release c

Re: Feature freeze for Lucene 9.12 and Lucene 10.0

2024-09-13 Thread Michael Sokolov
Hi Adrien, I thought we had another week? I looked back at old emails and thought you had targeted Sep 22 for feature freeze? On Fri, Sep 13, 2024, 7:45 AM Adrien Grand wrote: > Hello everyone, > > As previously discussed, I plan on feature freezing Lucene 9.12 and Lucene > 10.0 next week. Prac

Re: Lucene 10.0 and 9.12 blockers

2024-09-09 Thread Michael Sokolov
Hi, I've been looking into Adrien's suggestion to migrate (Byte/Float)VectorValues to an unabashedly random-access API. We can easily enough support iteration on top of that (which we use extensively during indexing). I think this would represent a great simplification; preliminary implementation s

Re: Baffling performance regression measured by luceneutil

2024-08-16 Thread Michael Sokolov
Maybe getSlices has some side effect that messes up createWeight? On Fri, Aug 16, 2024, 7:10 AM Michael Sokolov wrote: > That is super weird. I wonder if changing the names of variables will make > a difference. Have you verified that this effect is observable during all > lunar phas

Re: Baffling performance regression measured by luceneutil

2024-08-16 Thread Michael Sokolov
That is super weird. I wonder if changing the names of variables will make a difference. Have you verified that this effect is observable during all lunar phases? I assume we looked at any profiler diffs we could get our hands on? If not, maybe something would show up there. On Thu, Aug 15, 2024,

Re: AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-06 Thread Michael Sokolov
( TooComplexToDeterminizeException.class, () -> { new RegexpQuery(new Term("stringvalue", "(.*a){2000}")); }); } On Tue, Aug 6, 2024 at 10:56 AM Michael Sokolov wrote: > > Yes, I think degenerate regexes like *a* are potentially costly. > Actually some

Re: AbstractMultiTermQueryConstantScoreWrapper cost estimates (https://github.com/apache/lucene/issues/13029)

2024-08-06 Thread Michael Sokolov
Yes, I think degenerate regexes like *a* are potentially costly. Actually something like *Ⱗ* is probably worse since yeah it would need to scan the entire FST (which probably has some a's in it?) I don't see any way around that aside from: (1) telling user don't do that, or (2) putting some accoun

Re: Welcome Armin Braun as Lucene committer

2024-07-27 Thread Michael Sokolov
Welcome Armin! On Fri, Jul 26, 2024 at 7:24 PM Greg Miller wrote: > > Welcome Armin! > > On Fri, Jul 26, 2024 at 10:51 AM Patrick Zhai wrote: >> >> Congrats and welcome, Armin! >> >> >> On Fri, Jul 26, 2024, 10:30 Vigya Sharma wrote: >>> >>> Congratulations and welcome, Armin! Volunteering as a

Re: github notification delay

2024-07-02 Thread Michael Sokolov
ah that helps, thanks On Tue, Jul 2, 2024 at 2:41 PM Robert Muir wrote: > > On Tue, Jul 2, 2024 at 1:59 PM Michael Sokolov wrote: > > > > Hi all - I wonder if anyone else is observing weird email behavior > > from Github. I'm starting to see emails generated fro

github notification delay

2024-07-02 Thread Michael Sokolov
Hi all - I wonder if anyone else is observing weird email behavior from Github. I'm starting to see emails generated from PRs and issues that are wildly out of date. Like one dated yesterday that was generated from a comment that is weeks old. And I am missing many current updates -- as if there is

Re: [VOTE] Release Lucene 9.11.1 RC1

2024-06-24 Thread Michael Sokolov
SUCCESS! [0:55:48.190137] (tested w/Corretto JDK) +1 On Mon, Jun 24, 2024 at 8:01 AM Benjamin Trent wrote: > > SUCCESS! [0:40:46.898514] > > +1 > > On Mon, Jun 24, 2024 at 1:29 AM Ignacio Vera wrote: > > > > Please vote for release candidate 1 for Lucene 9.11.1 > > > > > > The artifacts can be

Re: Intellij build/test times

2024-06-13 Thread Michael Sokolov
Thanks for digging into this Dawid - I think it's important to keep an IDE dev path pretty clear of underbrush in order to encourage new joiners, even if it is not the primary or best means of building and testing On Thu, Jun 13, 2024 at 2:01 PM Dawid Weiss wrote: > > > Hi Mike, > > Just FYI - I

Re: scalar quantization heap usage during merge

2024-06-12 Thread Michael Sokolov
then re-scan to do the actual quantization? > > I am not sure what you mean here by "merge the float vectors". If you > mean simply reading the individual float vector files and combining > them into a single file, we already do that separately from > quantizing. > >

scalar quantization heap usage during merge

2024-06-12 Thread Michael Sokolov
Hi folks. I've been experimenting with our new scalar quantization support - yay, thanks for adding it! I'm finding that when I index a large number of large vectors, enabling quantization (vs simply indexing the full-width floats) requires more heap - I keep getting OOMs and have to increase heap

Re: Intellij build/test times

2024-06-10 Thread Michael Sokolov
If I set IJ build/test to "gradle" and then right click on "core" in the Project tab -- it gives an option like "run tests in lucene-root.lucene.core" which works. At the very top (lucene [lucene-root]) of the hierarchy you can right-click and select "run all tests", but this fails with "Error runn

Re: Intellij build/test times

2024-06-10 Thread Michael Sokolov
> > Yet I feel certain I have been able to run all tests in IJ before. > > > > I don't think this was ever the case with intellij. Or maybe you ran those > tests via gradle? When I say "run in IJ" I mean I right clicked a button somewhere and said "run all tests" :) I expect it was with the gradl

Re: Intellij build/test times

2024-06-09 Thread Michael Sokolov
OK, I can see how the directory structure might be at odds w/intellij's view of the world. Yet I feel certain I have been able to run all tests in IJ before. Just to disconfirm my insanity I tried again building and running all tests in core on branch_9x/main using both intellij and gradle build/te

Re: Intellij build/test times

2024-06-08 Thread Michael Sokolov
hould work. > > Running via gradle is slow for me not just with Lucene but also with other > projects... I can take a look but I'm pessimistic I can do any wonders here. > > Dawid > > On Fri, Jun 7, 2024 at 6:06 PM Michael Sokolov wrote: >> >

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
ule permissions thing controlling the visibility of these symbols? On Fri, Jun 7, 2024 at 11:53 AM Michael Sokolov wrote: > > hm I found FakeCharFilterFactory in src/test/META-INF.services -- it's > in a "test sources root" folder and won't allow itself to be set as

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
ssing. This can't be this hard! On Fri, Jun 7, 2024 at 11:44 AM Michael Sokolov wrote: > > hmm so after playing around with this Intellij build for a bit I ran > into some trouble -- all the tests relying on SPI seemed to start > failing. So then I switched back to build with G

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
n 7, 2024 at 10:40 AM Michael Sokolov wrote: > > ok, life must be scary for developers on windows! > > On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss wrote: > > > > > > Certain regenerate tasks do require perl and python indeed. > > > > On Fri, Jun 7, 2024 a

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
ok, life must be scary for developers on windows! On Fri, Jun 7, 2024 at 10:33 AM Dawid Weiss wrote: > > > Certain regenerate tasks do require perl and python indeed. > > On Fri, Jun 7, 2024 at 2:23 PM Michael Sokolov wrote: >> >> While editing this CONTRIBUTI

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
While editing this CONTRIBUTING.md I found the following statement: Some build tasks (in particular `./gradlew check`) require Perl and Python 3. Is it actually true that we require Perl? On Fri, Jun 7, 2024 at 8:11 AM Michael Sokolov wrote: > > So I'm glad we have a fix for thi

Re: Intellij build/test times

2024-06-07 Thread Michael Sokolov
me problem and it seems better now. Thank you, Dawid! > > On Thu, 6 Jun 2024 at 12:20, Michael Sokolov wrote: >> >> Oh! TIL! so much better, thanks. And now I have the "Repeat" option >> back in the test runner >> >> On Thu, Jun 6, 2024 at

Re: Intellij build/test times

2024-06-06 Thread Michael Sokolov
ly. Switch it to compile and run using its > own built-in method - much faster. > > > > Dawid > > On Thu, Jun 6, 2024 at 12:10 PM Michael Sokolov wrote: >> >> Hi, I wonder how many of us are using intellij to run Lucene tests, and if >> you are, have you notic

Intellij build/test times

2024-06-06 Thread Michael Sokolov
Hi, I wonder how many of us are using intellij to run Lucene tests, and if you are, have you noticed it having gotten really quite slow? It seems to take a long time doing... something... before the test starts running. I have a suspicion that we are using gradle in a way that forces it to rebuild

Re: [VOTE] Release Lucene 9.11.0 RC1

2024-06-03 Thread Michael Sokolov
+1 (tested w/Amazon Corretto JVM) SUCCESS! [0:46:40.066524] On Mon, Jun 3, 2024 at 7:30 AM Benjamin Trent wrote: > > Please vote for release candidate 1 for Lucene 9.11.0 > > The artifacts can be downloaded from: > https://dist.apache.org/repos/dist/dev/lucene/lucene-9.11.0-RC1-rev-d433394b292e3

Re: Lucene 9.11

2024-05-28 Thread Michael Sokolov
I misread this as "Lucene 911" as in "Lucene Emergency!!!" -- might not land for everyone - someday we will have Lucene 11.2? But ... no concerns from me aside from the things you mentioned - thanks for pushing, Ben On Tue, May 28, 2024 at 9:58 AM Benjamin Trent wrote: > > Hey y'all, > > I am pla

Re: Join module dependency

2024-05-19 Thread Michael Sokolov
I'm pretty sure it's only in core that we follow the no dependencies rule. On Sat, May 18, 2024, 11:25 AM Bruno Roustant wrote: > The facet module has a dependency on com.carrotsearch:hppc. > > Is it possible to add the same dependency to the join module ? What is the > rule ? > > Thanks > > Bru

Re: How much is ja.dict.UserDictionary used?

2024-05-18 Thread Michael Sokolov
We use it at Amazon. I can't really read it so I'm not sure, but I think it's used to encode terms that come up that aren't handled well by the standard dictionary. On Sat, May 18, 2024 at 8:39 AM Bruno Roustant wrote: > > Hi, > > While looking at the various usages of Map with Integer keys, I found

Re: beasting tests

2024-04-04 Thread Michael Sokolov
Thanks for the explanation. It makes sense that we start with a given seed and then each iteration is different because it re-uses the same Random instance (or whatever static state?) without re-initialization? On Wed, Apr 3, 2024 at 6:09 PM Dawid Weiss wrote: > > >> Now I just need to understand
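The behaviour asked about here can be sketched with plain `java.util.Random`: one generator seeded once hands out a different value on every iteration, yet the whole run is reproducible from the starting seed. This is only a generic illustration of that idea. It makes no claim about how Lucene's test framework or the beasting task actually derive per-iteration randomness, and `SeededIterations` and `run` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only; not how the Lucene test framework is implemented. */
final class SeededIterations {

  /** Draws one value per iteration from a single Random that is seeded exactly once. */
  static long[] run(long seed, int iterations) {
    Random random = new Random(seed); // initialised once, never re-seeded
    long[] draws = new long[iterations];
    for (int i = 0; i < iterations; i++) {
      draws[i] = random.nextLong(); // each iteration advances the shared state
    }
    return draws;
  }

  public static void main(String[] args) {
    long[] first = run(42L, 3);
    long[] second = run(42L, 3);
    // Iterations within one run see different values...
    System.out.println("iteration values: " + Arrays.toString(first));
    // ...but repeating the run with the same starting seed replays them exactly.
    System.out.println("reproducible: " + Arrays.equals(first, second));
  }
}
```

Under this model, reproducing a failure from a later iteration needs the starting seed plus the iteration count, since the state at iteration N depends on everything drawn before it.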

Re: beasting tests

2024-04-02 Thread Michael Sokolov
> <https://github.com/apache/lucene/blob/main/gradle/testing/beasting.gradle#L62-L66> >> in beasting.gradle >> <https://github.com/apache/lucene/blob/main/gradle/testing/beasting.gradle> >> . >> >> - Shubham >> >> On Wed, Apr 3, 2024 at 1:49 AM Mi

Re: beasting tests

2024-04-02 Thread Michael Sokolov
14 PM Michael Sokolov wrote: > > Is there a convenient way to run a test multiple times with different > seeds? Do I need to write my own script? I feel like I used to be able > to do this in IntelliJ, but that option seems to have vanished, and I > don't see any such option in

beasting tests

2024-04-02 Thread Michael Sokolov
Is there a convenient way to run a test multiple times with different seeds? Do I need to write my own script? I feel like I used to be able to do this in IntelliJ, but that option seems to have vanished, and I don't see any such option in gradle testOpts either. I tried -tests.iter but that seems

Re: [JENKINS] Lucene-9.x-Linux (64bit/hotspot/jdk-17.0.9) - Build # 15969 - Unstable!

2024-04-01 Thread Michael Sokolov
This TestBooleanMinShouldMatch.testRandomQueries failure did not reproduce for me on branch_9x, with JDK 11 or JDK 17 or JDK 21. I ran it a few times. TestByteVectorSimilarityQuery.testSomeDeletes reproduces reliably - I'll see if I can find out why it's unstable On Mon, Apr 1, 2024 at 9:50 AM Po

Re: Lucene 10

2024-03-14 Thread Michael Sokolov
timing makes sense to me. +1 for having a deadline to reduce procrastination, but Adrien I don't honestly believe anyone who is paying attention thinks that is what you have been doing! On Wed, Mar 13, 2024 at 10:40 AM Adrien Grand wrote: > > Hello everyone! > > It's been ~2.5 years since we rele

Re: Announcing githubsearch!

2024-02-27 Thread Michael Sokolov
Chrome on a Macbook, it's super dark. I can make > it out but I gotta stare for a bit ... do they make light and dark mode > .ico files in one!? > > Mike McCandless > > http://blog.mikemccandless.com > > > On Sun, Feb 25, 2024 at 6:05 PM Michael Sokolov > wrote: >

Re: Welcome Zhang Chao as Lucene committer

2024-02-25 Thread Michael Sokolov
Welcome and congratulations, Chao! On Sat, Feb 24, 2024 at 8:51 PM Christian Moen wrote: > > Congrats, Chao! > > On Wed, Feb 21, 2024 at 2:28 AM Adrien Grand wrote: >> >> I'm pleased to announce that Zhang Chao has accepted the PMC's >> invitation to become a committer. >> >> Chao, the tradition

Re: [Vote] Bump the Lucene main branch to Java 21

2024-02-25 Thread Michael Sokolov
+1 On Fri, Feb 23, 2024 at 7:08 PM Stefan Vodita wrote: > > +1 > > On Fri, 23 Feb 2024 at 11:24, Chris Hegarty > wrote: >> >> Hi, >> >> Since the discussion on bumping the Lucene main branch to Java 21 is winding >> down, let's hold a vote on this important change. >> >> Once bumped, the next

Re: Announcing githubsearch!

2024-02-25 Thread Michael Sokolov
here is a favicon you might want to try: I cropped the "VL" from the Apache Lucene logo (ok I guess it's an AL) -- if you save it as favicon.ico in the root of your website (ie as url /favicon.ico) it should show up in bookmarks, browser toolbars, etc as a handy memory aid. Of course you might have

Re: Announcing githubsearch!

2024-02-20 Thread Michael Sokolov
I love the gray all text UI. Don't change it! But I wonder if it's time for a favicon? On Tue, Feb 20, 2024, 4:40 AM Adrien Grand wrote: > Very cool, thank you Mike! > > On Mon, Feb 19, 2024 at 5:40 PM Michael McCandless < > luc...@mikemccandless.com> wrote: > >> Hi Team, >> >> ~1.5 years ago (A

Re: Welcome Stefan Vodita as Lucene committter

2024-01-19 Thread Michael Sokolov
Hello Stefan, welcome! On Fri, Jan 19, 2024 at 10:41 AM Martin Gainty wrote: > Congratulations Stefan! > > I look forward to reading your posts > > ~martin > -- > *From:* Michael McCandless > *Sent:* Thursday, January 18, 2024 10:53 AM > *To:* dev@lucene.apache.org

Re: [VOTE] Release Lucene 9.9.1 RC1

2023-12-14 Thread Michael Sokolov
+1 SUCCESS! [0:50:50.776559] Note: we did get some test fails on the mailing list this morning, but I believe they are not real bugs and will be resolved by tightening up our test assumptions On Thu, Dec 14, 2023 at 7:08 AM Guo Feng wrote: > +1 > > SUCCESS! [3:38:43.833896] > > On 2023/12/14 1

Re: [VOTE] Release Lucene 9.9.0 RC2

2023-11-30 Thread Michael Sokolov
SUCCESS! [0:46:20.693134] +1 On Thu, Nov 30, 2023 at 5:50 PM Tomás Fernández Löbbe wrote: > SUCCESS! [0:52:49.337126] > > +1 > > On Thu, Nov 30, 2023 at 12:05 PM Benjamin Trent > wrote: > >> SUCCESS! [0:44:05.132154] >> >> +1 >> >> On Thu, Nov 30, 2023 at 1:09 PM Chris Hegarty >> wrote: >> >>

Re: [VOTE] Release Lucene 9.9.0 RC1

2023-11-30 Thread Michael Sokolov
for the sake of posterity, I did get a successful smoketest: SUCCESS! [1:00:06.512261] but +0 to release I guess since it's moot... On Thu, Nov 30, 2023 at 10:38 AM Michael McCandless < luc...@mikemccandless.com> wrote: > On Thu, Nov 30, 2023 at 9:56 AM Chris Hegarty > wrote: > > P.S. I’m less

Re: GDPR compliance

2023-11-29 Thread Michael Sokolov
Another way is to ensure that all documents get updated on a regular cadence whether there are changes in the underlying data or not. Or, regenerating the index from scratch all the time. Of course these approaches might be more costly for an index that has intrinsically low update rates, but they

Re: Lucene 9.9.0 Release

2023-11-22 Thread Michael Sokolov
+1 thanks for volunteering! Hijacking the thread a bit, sorry, I started looking into whether this is a good time to start looking ahead to 10? I know we had some rumblings about releasing that so we can start requiring newer JDKs. But looking at CHANGES it feels like we already back-ported most o

Re: Test framework can't find SPI implementations from module sandbox

2023-11-21 Thread Michael Sokolov
did you add to the sandbox META-INF file? It looks like maybe sandbox is not included in the scope of the test, but you didn't say which test it was. Is the test also in the sandbox module? On Mon, Nov 20, 2023 at 6:56 PM Dongyu Xu wrote: > Hi devs, > > I tried to plug in my experimental Posting

Re: Welcome Patrick Zhai to the Lucene PMC

2023-11-12 Thread Michael Sokolov
Welcome, Patrick! On Sun, Nov 12, 2023, 2:12 AM Ignacio Vera wrote: > Welcome Patrick! > > On Sat, Nov 11, 2023 at 3:29 PM Uwe Schindler wrote: > >> Welcome Patrick! >> >> Uwe >> >> >> Am 10. November 2023 21:04:32 MEZ schrieb Michael McCandless < >> luc...@mikemccandless.com>: >> >>> I'm happy

Re: Boolean field type

2023-11-09 Thread Michael Sokolov
Can you require the user to specify missing: true or missing: false semantics? With that you can decide what to do with the missing values On Thu, Nov 9, 2023, 7:55 AM Mikhail Khludnev wrote: > Hello Michael. > This optimization "NOT the less common value" assumes that boolean field > is require

Re: Bump minimum Java version requirement to 21

2023-11-06 Thread Michael Sokolov
It's not just you - we have an internal JDK11 fork at BIG COMPANY for some folks that can't get off the stick. To be fair it's challenging because they have to shift all their dependencies. I think Spark was the one mentioned by one group, but there is a JDK17-based release of Spark, so clearly not
